ArviZ 1.0

Refactoring for flexibility, extensibility, and power

Osvaldo Martin

Aalto University

2025-11-11

What is ArviZ?


The bass player of Bayesian modelling (with Python)

  • Plays a key supporting role:
    • Model evaluation
    • Model comparison
    • MCMC diagnostics
    • Presentation of results

The origin story


  • A PyMC spin-off
  • In March 2018 we decided to move diagnostics and plotting out of the PyMC code base

Why a refactoring?

  • We were never really happy with the design, but people needed the tools!
  • A few issues:
    • dependencies: sometimes NumPy is all you need
    • messy arguments: more flexibility meant more and more arguments.
    • hard to reuse code: both for us and for users.
    • duplicated code: similar functionality implemented twice (matplotlib/bokeh).

Divide and conquer

All ArviZ functionality is available in 3 packages

  • arviz-base: I/O and data structure manipulation
  • arviz-stats: for statistical functions and diagnostics
  • arviz-plots: visual checks

Note

We expect most users to install everything together, so for them ArviZ is still one package.
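
As a rough sketch of what the split means in practice, the snippet below checks which of the three packages can be imported in the current environment. The import names `arviz_base` and `arviz_stats` appear elsewhere in these slides; `arviz_plots` is assumed to follow the same pattern, and the helper `available_arviz_packages` is hypothetical, written only for this illustration.

```python
from importlib.util import find_spec

# Import names of the three packages; "arviz_plots" is assumed to follow
# the same naming pattern as the other two.
ARVIZ_PACKAGES = ("arviz_base", "arviz_stats", "arviz_plots")


def available_arviz_packages(names=ARVIZ_PACKAGES):
    """Return a dict mapping each import name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in available_arviz_packages().items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

A user who only needs diagnostics on raw arrays could, under this assumption, get by with `arviz_stats` alone and skip the plotting stack.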

ArviZ-base

import arviz_base as azb
dt = azb.load_arviz_data("anes")
dt
<xarray.DatasetView> Size: 0B
Dimensions:  ()
Data variables:
    *empty*

ArviZ-stats

import arviz_stats as azs

azs.ess(dt)
<xarray.DatasetView> Size: 6kB
Dimensions:           (party_id_dim: 2, party_id:age_dim: 3, __obs__: 373)
Coordinates:
  * party_id_dim      (party_id_dim) <U11 88B 'independent' 'republican'
  * party_id:age_dim  (party_id:age_dim) <U11 132B 'democrat' ... 'republican'
  * __obs__           (__obs__) int64 3kB 0 1 2 3 4 5 ... 368 369 370 371 372
Data variables:
    Intercept         float64 8B 5.92e+03
    party_id          (party_id_dim) float64 16B 5.678e+03 5.846e+03
    party_id:age      (party_id:age_dim) float64 24B 5.469e+03 ... 5.615e+03
    p                 (__obs__) float64 3kB 7.12e+03 6.478e+03 ... 7.984e+03

ArviZ-stats array-interface

import numpy as np
from arviz_stats.base import array_stats

# generate mock MCMC-like data
rng = np.random.default_rng()
samples = rng.normal(size=(4, 1000, 3))

array_stats.ess(samples, chain_axis=0, draw_axis=1)
array([3714.58251871, 3983.81458725, 3999.80437052])
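
To make the `chain_axis` / `draw_axis` convention concrete, here is a minimal NumPy-only sketch of a diagnostic that takes the same kind of array. It computes the classic (non-split, non-rank-normalized) R-hat and is not the arviz-stats implementation; the function name `basic_rhat` is hypothetical.

```python
import numpy as np


def basic_rhat(samples, chain_axis=0, draw_axis=1):
    """Classic (non-split, non-rank-normalized) R-hat for an array of draws."""
    # Move chain and draw axes to the front: (chain, draw, ...)
    x = np.moveaxis(np.asarray(samples, dtype=float), (chain_axis, draw_axis), (0, 1))
    n_draws = x.shape[1]
    # Within-chain variance, averaged over chains
    within = x.var(axis=1, ddof=1).mean(axis=0)
    # Between-chain variance of the chain means (B / n)
    between_over_n = x.mean(axis=1).var(axis=0, ddof=1)
    # Pooled variance estimate
    var_hat = (n_draws - 1) / n_draws * within + between_over_n
    return np.sqrt(var_hat / within)


rng = np.random.default_rng(0)
samples = rng.normal(size=(4, 1000, 3))  # (chain, draw, parameter)
rhat = basic_rhat(samples, chain_axis=0, draw_axis=1)
print(rhat.shape)
```

Because the axes are passed explicitly, the same function works on arrays stored as (draw, chain, ...) by swapping the two arguments, which is the point of an array interface that does not depend on labelled dimensions.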

ArviZ-plots


import arviz_plots as azp
azp.plot_dist(dt, var_names=["~p"]).show()

ArviZ-plots


azp.plot_dist(dt, var_names=["~p"],
              visuals={"dist": {"color": "C5", "linestyle": "C2"},
                       "credible_interval": False,
                       "point_estimate": False,
                       "point_estimate_text": False,
                      }).show()

ArviZ-plots

azp.plot_dist(
    dt,
    var_names=["Intercept", "party_id:age"],
    kind="ecdf",
    visuals={"point_estimate_text": False},
    cols=["__variable__"],
    aes={"color": ["party_id:age_dim"], "y": ["party_id:age_dim"]},
    y=np.linspace(0, 0.1, 4),
    aes_by_visuals={
        "dist": ["color"],
        "point_estimate": ["color", "y"],
        "credible_interval": ["y"],
    },
).show()

Where are we now?

  • not yet at 1.0, but really close
  • if you are already using ArviZ, you can try:
pip install -U arviz
pip install "arviz-plots[<backend>]"
# <backend> is one of: matplotlib, plotly, bokeh


and then:

import arviz as az  # old arviz
import arviz.preview as azp  # new arviz